# Student Activity: Financial Forecasting
In this activity, Harold's manager wants Harold to review three years' worth of TSLA stock prices and plot a potential trajectory for where TSLA prices could go over the next 3 years. In addition, he would like to know how a $10,000 investment would perform given the simulated results.

Help Harold by creating a Monte Carlo simulation that simulates the next 252 * 3 trading days using three years' worth of TSLA stock data. Plot the simulated daily returns of TSLA over the next 3 years, as well as the corresponding simulated outcomes.
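The core idea of this kind of simulation can be sketched in a few lines of NumPy. This is a rough illustration only, not the actual `MCForecastTools` implementation, and the historical returns below are synthetic stand-ins (in the activity they would come from the fetched TSLA prices): fit a normal distribution to historical daily returns, then sample and compound daily returns for each simulated path.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical historical daily returns; in the activity these would be
# computed from the TSLA closing prices fetched below
hist_returns = rng.normal(0.001, 0.03, size=756)

# Fit a normal distribution to the historical daily returns
mu, sigma = hist_returns.mean(), hist_returns.std()

# Sample daily returns for each simulated path and compound them
num_days, num_sims = 252 * 3, 1000
sampled = rng.normal(mu, sigma, size=(num_days, num_sims))
cumulative = (1 + sampled).cumprod(axis=0)

print(cumulative.shape)  # one row per trading day, one column per simulation
```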
```python
# Import libraries and dependencies
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import alpaca_trade_api as tradeapi
from MCForecastTools import MCSimulation

%matplotlib inline
```

```
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-5d57e9b5bad8> in <module>
      4 import pandas as pd
      5 import matplotlib.pyplot as plt
----> 6 import alpaca_trade_api as tradeapi
      7 from MCForecastTools import MCSimulation
      8 

ModuleNotFoundError: No module named 'alpaca_trade_api'
```

(The import fails because the Alpaca SDK is not installed; `pip install alpaca-trade-api` resolves it.)
```python
# Load .env environment variables
from dotenv import load_dotenv
load_dotenv()

# Set Alpaca API key and secret
alpaca_api_key = os.getenv("ALPACA_API_KEY")
alpaca_secret_key = os.getenv("ALPACA_SECRET_KEY")

api = tradeapi.REST(
    alpaca_api_key,
    alpaca_secret_key,
    api_version="v2"
)
```

## Get 3 Years' Worth of Data via API Call and Read in as DataFrame
```python
# Set the ticker
ticker = "TSLA"

# Set timeframe to '1D'
timeframe = "1D"

# Set start and end datetimes spanning 3 years
start_date = pd.Timestamp("2017-05-04", tz="America/New_York").isoformat()
end_date = pd.Timestamp("2020-05-04", tz="America/New_York").isoformat()

# Get 3 years' worth of historical data for TSLA
ticker_data = api.get_barset(
    ticker,
    timeframe,
    start=start_date,
    end=end_date,
    limit=1000,
).df

ticker_data.head()
```

| | TSLA | | | | |
|---|---|---|---|---|---|
| | open | high | low | close | volume |
| 2017-05-04 00:00:00-04:00 | 307.435 | 307.77 | 290.7601 | 295.36 | 11653966 |
| 2017-05-05 00:00:00-04:00 | 298.000 | 308.55 | 296.8000 | 308.35 | 6695227 |
| 2017-05-08 00:00:00-04:00 | 310.600 | 313.79 | 305.8200 | 307.20 | 5969091 |
| 2017-05-09 00:00:00-04:00 | 309.380 | 321.99 | 309.1000 | 321.27 | 8158739 |
| 2017-05-10 00:00:00-04:00 | 321.560 | 325.40 | 318.1200 | 325.22 | 4754047 |
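`MCSimulation` presumably derives daily returns from the `close` column internally; the equivalent pandas operation is `pct_change`. A minimal sketch using the five closing prices from the table above:

```python
import pandas as pd

# Closing prices from the first five rows of the table above
close = pd.Series([295.36, 308.35, 307.20, 321.27, 325.22])

# Daily returns: percentage change between consecutive closes
# (the first value is NaN, since there is no prior close to compare against)
daily_returns = close.pct_change().dropna()
print(daily_returns)
```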
## Run the Monte Carlo Simulation
```python
# Set number of simulations
num_sims = 1000

# Configure a Monte Carlo simulation to forecast three years of daily returns
MC_TSLA = MCSimulation(
    portfolio_data=ticker_data,
    num_simulation=num_sims,
    num_trading_days=252 * 3
)

# Run Monte Carlo simulations to forecast three years of daily returns
MC_TSLA.calc_cumulative_return()
```

```
Running Monte Carlo simulation number 0.
Running Monte Carlo simulation number 10.
Running Monte Carlo simulation number 20.
...
Running Monte Carlo simulation number 980.
Running Monte Carlo simulation number 990.
```
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 990 | 991 | 992 | 993 | 994 | 995 | 996 | 997 | 998 | 999 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| 1 | 0.981222 | 0.990782 | 1.001497 | 0.968636 | 1.025662 | 0.985579 | 0.979159 | 0.959626 | 0.915200 | 1.100719 | ... | 1.035320 | 0.974333 | 1.129787 | 0.991942 | 1.026417 | 1.070166 | 1.043902 | 1.019487 | 0.967719 | 0.982966 |
| 2 | 0.979845 | 0.932936 | 0.950812 | 1.001255 | 0.978447 | 0.956805 | 0.971956 | 0.959406 | 0.916128 | 1.014008 | ... | 1.066326 | 0.974253 | 1.142736 | 1.009477 | 1.010455 | 1.083022 | 1.034498 | 1.023799 | 0.886175 | 0.955799 |
| 3 | 0.982248 | 0.958489 | 0.907635 | 1.076008 | 0.924043 | 0.941103 | 0.972826 | 0.994823 | 0.953968 | 1.040536 | ... | 1.099464 | 0.990134 | 1.172254 | 0.948681 | 0.967827 | 1.055193 | 1.048936 | 1.023842 | 0.954594 | 0.966601 |
| 4 | 1.007947 | 0.994389 | 0.863432 | 1.100683 | 0.928010 | 0.884032 | 0.965998 | 0.947342 | 0.920592 | 1.079854 | ... | 1.135327 | 0.972608 | 1.139030 | 0.909896 | 0.982004 | 1.046113 | 0.950495 | 0.958808 | 0.951463 | 0.908549 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 752 | 2.964285 | 6.281219 | 3.528422 | 5.569597 | 1.537283 | 1.927135 | 1.835281 | 2.273169 | 4.462258 | 6.555616 | ... | 2.991133 | 3.373659 | 3.035196 | 1.345995 | 0.360124 | 34.367965 | 1.629844 | 8.824132 | 3.348889 | 10.723148 |
| 753 | 2.898127 | 5.929781 | 3.630850 | 5.614963 | 1.532372 | 1.913588 | 1.905665 | 2.160138 | 4.445566 | 6.939099 | ... | 3.156057 | 3.510643 | 3.030412 | 1.298904 | 0.357633 | 34.687223 | 1.666808 | 9.236994 | 3.099606 | 10.409932 |
| 754 | 2.805524 | 5.824232 | 3.821479 | 6.019107 | 1.466395 | 1.944477 | 1.955238 | 2.174833 | 4.450806 | 6.638072 | ... | 3.260135 | 3.581977 | 2.908168 | 1.199393 | 0.386841 | 36.692925 | 1.657787 | 9.105443 | 3.114519 | 10.542502 |
| 755 | 2.719517 | 5.297904 | 3.993555 | 6.214585 | 1.463815 | 1.966690 | 1.964821 | 2.180150 | 4.592842 | 6.726524 | ... | 3.448536 | 3.769620 | 3.132997 | 1.138237 | 0.379576 | 37.370551 | 1.649728 | 9.167870 | 3.188418 | 10.650525 |
| 756 | 2.812057 | 5.466324 | 3.981998 | 6.201528 | 1.494064 | 2.016463 | 1.833008 | 2.172569 | 4.570772 | 6.460053 | ... | 3.498799 | 3.943865 | 3.263377 | 1.214728 | 0.371717 | 37.689524 | 1.629001 | 9.286477 | 3.326631 | 10.344296 |
757 rows × 1000 columns
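The 95% confidence interval reported further below comes from the distribution of final-day outcomes in a table like this one. As a sketch (not the actual `MCForecastTools` helper, which may differ in detail), the interval can be approximated by taking the 2.5th and 97.5th percentiles of the last row, using a synthetic stand-in DataFrame:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Stand-in for MC_TSLA.simulated_return: rows are trading days, columns are simulations
simulated = pd.DataFrame(
    (1 + rng.normal(0.001, 0.03, size=(756, 1000))).cumprod(axis=0)
)

# The final row holds each simulation's cumulative return after 756 trading days
final_returns = simulated.iloc[-1]

# Empirical 95% interval: the 2.5th and 97.5th percentiles of the outcomes
ci_lower = final_returns.quantile(0.025)
ci_upper = final_returns.quantile(0.975)
print(ci_lower, ci_upper)
```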
## Plot the Simulated Daily Returns Trajectory for TSLA over the Next Three Years (756 Trading Days)
```python
# Compute summary statistics from the simulated daily returns
simulated_returns_data = {
    "mean": list(MC_TSLA.simulated_return.mean(axis=1)),
    "median": list(MC_TSLA.simulated_return.median(axis=1)),
    "min": list(MC_TSLA.simulated_return.min(axis=1)),
    "max": list(MC_TSLA.simulated_return.max(axis=1))
}

# Create a DataFrame with the summary statistics
df_simulated_returns = pd.DataFrame(simulated_returns_data)

# Display sample data
df_simulated_returns.head()
```

| | mean | median | min | max |
|---|---|---|---|---|
| 0 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| 1 | 1.002953 | 1.002788 | 0.856863 | 1.129787 |
| 2 | 1.005907 | 1.005676 | 0.826181 | 1.203377 |
| 3 | 1.007616 | 1.006666 | 0.801393 | 1.232933 |
| 4 | 1.008332 | 1.004927 | 0.768805 | 1.294849 |
```python
# Use the `plot` function to visually analyze the trajectory of TSLA stock
# daily returns over the next three years of simulated trading days
df_simulated_returns.plot(title="Simulated Daily Returns Behavior of TSLA Stock Over the Next Three Years")
```

```
<matplotlib.axes._subplots.AxesSubplot at 0x7faaf8c88610>
```
## Calculate the Simulated Profits/Losses of a $10,000 Investment in TSLA Over the Next Three Years
```python
# Set initial investment
initial_investment = 10000

# Multiply the initial investment by the simulated daily returns to express
# the progression of cumulative returns in dollar terms
cumulative_pnl = initial_investment * df_simulated_returns

# Display sample data
cumulative_pnl.head()
```

| | mean | median | min | max |
|---|---|---|---|---|
| 0 | 10000.000000 | 10000.000000 | 10000.000000 | 10000.000000 |
| 1 | 10029.532050 | 10027.879860 | 8568.632158 | 11297.874167 |
| 2 | 10059.066285 | 10056.758964 | 8261.814625 | 12033.766254 |
| 3 | 10076.159026 | 10066.656818 | 8013.931073 | 12329.325621 |
| 4 | 10083.317080 | 10049.269047 | 7688.053281 | 12948.490424 |
## Plot the Simulated Profits/Losses of a $10,000 Investment in TSLA Over the Next Three Years (756 Trading Days)
```python
# Use the `plot` function to create a chart of the simulated profits/losses
cumulative_pnl.plot(title="Simulated Outcomes Behavior of TSLA Stock Over the Next Three Years")
```

```
<matplotlib.axes._subplots.AxesSubplot at 0x7faaf2e62190>
```
## Calculate the Range of Possible Outcomes of Our $10,000 Investment in TSLA Stock
```python
# Fetch summary statistics from the Monte Carlo simulation results
tbl = MC_TSLA.summarize_cumulative_return()

# Print summary statistics
print(tbl)
```

```
count           1000.000000
mean               4.906131
std                6.899562
min                0.126797
25%                1.232316
50%                2.655507
75%                6.011975
max               85.991793
95% CI Lower       0.371666
95% CI Upper      23.772773
Name: 756, dtype: float64
```
```python
# Use the lower and upper 95% confidence intervals to calculate the range of
# possible outcomes of our $10,000 investment in TSLA stock
ci_lower = round(tbl[8] * 10000, 2)
ci_upper = round(tbl[9] * 10000, 2)

# Print results
print(f"There is a 95% chance that an initial investment of $10,000 in the portfolio"
      f" over the next three years will end within the range of"
      f" ${ci_lower} and ${ci_upper}.")
```

```
There is a 95% chance that an initial investment of $10,000 in the portfolio over the next three years will end within the range of $3716.66 and $237727.73.
```
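Positional access like `tbl[8]` silently breaks if `summarize_cumulative_return` ever changes the order of its summary rows; label-based access with `.loc` is more robust. A minimal sketch, using a stand-in Series shaped like the summary output (only a subset of its rows, with the values recorded above):

```python
import pandas as pd

# Stand-in for the summary Series returned by summarize_cumulative_return
tbl = pd.Series({
    "count": 1000.0,
    "mean": 4.906131,
    "95% CI Lower": 0.371666,
    "95% CI Upper": 23.772773,
})

initial_investment = 10000

# Select the interval bounds by label instead of position
ci_lower = round(tbl.loc["95% CI Lower"] * initial_investment, 2)
ci_upper = round(tbl.loc["95% CI Upper"] * initial_investment, 2)
print(ci_lower, ci_upper)  # 3716.66 237727.73
```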
# World Happiness
## 1. Import Libraries and Dependencies
```python
# Import dependencies
import pandas as pd
```

## 2. Set File Path to CSV using Pathlib
```python
from pathlib import Path

# Create reference to CSV file
# Hint - the file path may require some extra steps.
csv_path = Path("../Resources/world_happiness.csv")
```

## 3. Import CSV into Pandas DataFrame
```python
# Import the CSV into a pandas DataFrame
happiness_df = pd.read_csv(csv_path, low_memory=False)
happiness_df
```

| | Country | Happiness.Rank | Happiness.Score | Whisker.high | Whisker.low | Economy..GDP.per.Capita. | Family | Health..Life.Expectancy. | Freedom | Generosity | Trust..Government.Corruption. | Dystopia.Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Norway | 1 | 7.537 | 7.594445 | 7.479556 | 1.616463 | 1.533524 | 0.796667 | 0.635423 | 0.362012 | 0.315964 | 2.277027 |
| 1 | Denmark | 2 | 7.522 | 7.581728 | 7.462272 | 1.482383 | 1.551122 | 0.792566 | 0.626007 | 0.355280 | 0.400770 | 2.313707 |
| 2 | Iceland | 3 | 7.504 | 7.622030 | 7.385970 | 1.480633 | 1.610574 | 0.833552 | 0.627163 | 0.475540 | 0.153527 | 2.322715 |
| 3 | Switzerland | 4 | 7.494 | 7.561772 | 7.426227 | 1.564980 | 1.516912 | 0.858131 | 0.620071 | 0.290549 | 0.367007 | 2.276716 |
| 4 | Finland | 5 | 7.469 | 7.527542 | 7.410458 | 1.443572 | 1.540247 | 0.809158 | 0.617951 | 0.245483 | 0.382612 | 2.430182 |
| 5 | Netherlands | 6 | 7.377 | 7.427426 | 7.326574 | 1.503945 | 1.428939 | 0.810696 | 0.585384 | 0.470490 | 0.282662 | 2.294804 |
| 6 | Canada | 7 | 7.316 | 7.384403 | 7.247597 | 1.479204 | 1.481349 | 0.834558 | 0.611101 | 0.435540 | 0.287372 | 2.187264 |
| 7 | New Zealand | 8 | 7.314 | 7.379510 | 7.248490 | 1.405706 | 1.548195 | 0.816760 | 0.614062 | 0.500005 | 0.382817 | 2.046456 |
| 8 | Sweden | 9 | 7.284 | 7.344095 | 7.223905 | 1.494387 | 1.478162 | 0.830875 | 0.612924 | 0.385399 | 0.384399 | 2.097538 |
| 9 | Australia | 10 | 7.284 | 7.356651 | 7.211349 | 1.484415 | 1.510042 | 0.843887 | 0.601607 | 0.477699 | 0.301184 | 2.065211 |
| 10 | Israel | 11 | 7.213 | 7.279853 | 7.146146 | 1.375382 | 1.376290 | 0.838404 | 0.405989 | 0.330083 | 0.085242 | 2.801757 |
| 11 | Costa Rica | 12 | 7.079 | 7.168112 | 6.989888 | 1.109706 | 1.416404 | 0.759509 | 0.580132 | 0.214613 | 0.100107 | 2.898639 |
| 12 | Austria | 13 | 7.006 | 7.070670 | 6.941330 | 1.487097 | 1.459945 | 0.815328 | 0.567766 | 0.316472 | 0.221060 | 2.138506 |
| 13 | United States | 14 | 6.993 | 7.074657 | 6.911343 | 1.546259 | 1.419921 | 0.774287 | 0.505741 | 0.392579 | 0.135639 | 2.218113 |
| 14 | Ireland | 15 | 6.977 | 7.043352 | 6.910649 | 1.535707 | 1.558231 | 0.809783 | 0.573110 | 0.427858 | 0.298388 | 1.773869 |
| 15 | Germany | 16 | 6.951 | 7.005382 | 6.896619 | 1.487923 | 1.472520 | 0.798951 | 0.562511 | 0.336269 | 0.276732 | 2.015770 |
| 16 | Belgium | 17 | 6.891 | 6.955821 | 6.826179 | 1.463781 | 1.462313 | 0.818092 | 0.539771 | 0.231503 | 0.251343 | 2.124210 |
| 17 | Luxembourg | 18 | 6.863 | 6.923686 | 6.802314 | 1.741944 | 1.457584 | 0.845089 | 0.596628 | 0.283181 | 0.318834 | 1.619512 |
| 18 | United Kingdom | 19 | 6.714 | 6.783792 | 6.644209 | 1.441634 | 1.496460 | 0.805336 | 0.508190 | 0.492774 | 0.265428 | 1.704144 |
| 19 | Chile | 20 | 6.652 | 6.739251 | 6.564749 | 1.252785 | 1.284025 | 0.819480 | 0.376895 | 0.326662 | 0.082288 | 2.509586 |
| 20 | United Arab Emirates | 21 | 6.648 | 6.722047 | 6.573952 | 1.626343 | 1.266410 | 0.726798 | 0.608345 | 0.360942 | 0.324490 | 1.734704 |
| 21 | Brazil | 22 | 6.635 | 6.725470 | 6.544531 | 1.107353 | 1.431306 | 0.616552 | 0.437454 | 0.162350 | 0.111093 | 2.769267 |
| 22 | Czech Republic | 23 | 6.609 | 6.683862 | 6.534138 | 1.352682 | 1.433885 | 0.754444 | 0.490946 | 0.088107 | 0.036873 | 2.451862 |
| 23 | Argentina | 24 | 6.599 | 6.690085 | 6.507915 | 1.185295 | 1.440451 | 0.695137 | 0.494519 | 0.109457 | 0.059740 | 2.614005 |
| 24 | Mexico | 25 | 6.578 | 6.671149 | 6.484851 | 1.153184 | 1.210862 | 0.709979 | 0.412730 | 0.120990 | 0.132774 | 2.837155 |
| 25 | Singapore | 26 | 6.572 | 6.636723 | 6.507277 | 1.692278 | 1.353814 | 0.949492 | 0.549841 | 0.345966 | 0.464308 | 1.216362 |
| 26 | Malta | 27 | 6.527 | 6.598397 | 6.455603 | 1.343280 | 1.488412 | 0.821944 | 0.588767 | 0.574731 | 0.153066 | 1.556863 |
| 27 | Uruguay | 28 | 6.454 | 6.545906 | 6.362094 | 1.217560 | 1.412228 | 0.719217 | 0.579392 | 0.175097 | 0.178062 | 2.172410 |
| 28 | Guatemala | 29 | 6.454 | 6.566874 | 6.341126 | 0.872002 | 1.255585 | 0.540240 | 0.531311 | 0.283488 | 0.077223 | 2.893891 |
| 29 | Panama | 30 | 6.452 | 6.557131 | 6.346870 | 1.233748 | 1.373193 | 0.706156 | 0.550027 | 0.210557 | 0.070984 | 2.307200 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 125 | Congo (Kinshasa) | 126 | 4.280 | 4.357811 | 4.202190 | 0.092102 | 1.229023 | 0.191407 | 0.235961 | 0.246456 | 0.060241 | 2.224959 |
| 126 | Mali | 127 | 4.190 | 4.269671 | 4.110329 | 0.476180 | 1.281473 | 0.169366 | 0.306614 | 0.183354 | 0.104970 | 1.668191 |
| 127 | Ivory Coast | 128 | 4.180 | 4.275183 | 4.084817 | 0.603049 | 0.904780 | 0.048642 | 0.447706 | 0.201237 | 0.130062 | 1.844964 |
| 128 | Cambodia | 129 | 4.168 | 4.278518 | 4.057483 | 0.601765 | 1.006238 | 0.429783 | 0.633376 | 0.385923 | 0.068106 | 1.042941 |
| 129 | Sudan | 130 | 4.139 | 4.345747 | 3.932253 | 0.659517 | 1.214009 | 0.290921 | 0.014996 | 0.182317 | 0.089848 | 1.687066 |
| 130 | Ghana | 131 | 4.120 | 4.222707 | 4.017293 | 0.667225 | 0.873665 | 0.295638 | 0.423026 | 0.256924 | 0.025336 | 1.577868 |
| 131 | Ukraine | 132 | 4.096 | 4.185410 | 4.006590 | 0.894652 | 1.394538 | 0.575904 | 0.122975 | 0.270061 | 0.023029 | 0.814382 |
| 132 | Uganda | 133 | 4.081 | 4.195800 | 3.966200 | 0.381431 | 1.129828 | 0.217633 | 0.443186 | 0.325766 | 0.057070 | 1.526363 |
| 133 | Burkina Faso | 134 | 4.032 | 4.124059 | 3.939941 | 0.350228 | 1.043280 | 0.215844 | 0.324368 | 0.250865 | 0.120328 | 1.727213 |
| 134 | Niger | 135 | 4.028 | 4.111947 | 3.944053 | 0.161925 | 0.993025 | 0.268505 | 0.363659 | 0.228674 | 0.138573 | 1.873983 |
| 135 | Malawi | 136 | 3.970 | 4.077479 | 3.862521 | 0.233442 | 0.512569 | 0.315090 | 0.466915 | 0.287170 | 0.072712 | 2.081786 |
| 136 | Chad | 137 | 3.936 | 4.034712 | 3.837289 | 0.438013 | 0.953856 | 0.041135 | 0.162342 | 0.216114 | 0.053582 | 2.071238 |
| 137 | Zimbabwe | 138 | 3.875 | 3.978700 | 3.771300 | 0.375847 | 1.083096 | 0.196764 | 0.336384 | 0.189143 | 0.095375 | 1.597970 |
| 138 | Lesotho | 139 | 3.808 | 4.044344 | 3.571656 | 0.521021 | 1.190095 | 0.000000 | 0.390661 | 0.157497 | 0.119095 | 1.429835 |
| 139 | Angola | 140 | 3.795 | 3.951642 | 3.638358 | 0.858428 | 1.104412 | 0.049869 | 0.000000 | 0.097926 | 0.069720 | 1.614482 |
| 140 | Afghanistan | 141 | 3.794 | 3.873661 | 3.714338 | 0.401477 | 0.581543 | 0.180747 | 0.106180 | 0.311871 | 0.061158 | 2.150801 |
| 141 | Botswana | 142 | 3.766 | 3.874123 | 3.657877 | 1.122094 | 1.221555 | 0.341756 | 0.505196 | 0.099348 | 0.098583 | 0.377914 |
| 142 | Benin | 143 | 3.657 | 3.745784 | 3.568217 | 0.431085 | 0.435300 | 0.209930 | 0.425963 | 0.207948 | 0.060929 | 1.885631 |
| 143 | Madagascar | 144 | 3.644 | 3.714319 | 3.573681 | 0.305809 | 0.913020 | 0.375223 | 0.189197 | 0.208733 | 0.067232 | 1.584613 |
| 144 | Haiti | 145 | 3.603 | 3.734715 | 3.471285 | 0.368610 | 0.640450 | 0.277321 | 0.030370 | 0.489204 | 0.099872 | 1.697168 |
| 145 | Yemen | 146 | 3.593 | 3.692750 | 3.493250 | 0.591683 | 0.935382 | 0.310081 | 0.249464 | 0.104125 | 0.056767 | 1.345601 |
| 146 | South Sudan | 147 | 3.591 | 3.725539 | 3.456462 | 0.397249 | 0.601323 | 0.163486 | 0.147062 | 0.285671 | 0.116794 | 1.879567 |
| 147 | Liberia | 148 | 3.533 | 3.653756 | 3.412244 | 0.119042 | 0.872118 | 0.229918 | 0.332881 | 0.266550 | 0.038948 | 1.673286 |
| 148 | Guinea | 149 | 3.507 | 3.584428 | 3.429572 | 0.244550 | 0.791245 | 0.194129 | 0.348588 | 0.264815 | 0.110938 | 1.552312 |
| 149 | Togo | 150 | 3.495 | 3.594038 | 3.395962 | 0.305445 | 0.431883 | 0.247106 | 0.380426 | 0.196896 | 0.095665 | 1.837229 |
| 150 | Rwanda | 151 | 3.471 | 3.543030 | 3.398970 | 0.368746 | 0.945707 | 0.326425 | 0.581844 | 0.252756 | 0.455220 | 0.540061 |
| 151 | Syria | 152 | 3.462 | 3.663669 | 3.260331 | 0.777153 | 0.396103 | 0.500533 | 0.081539 | 0.493664 | 0.151347 | 1.061574 |
| 152 | Tanzania | 153 | 3.349 | 3.461430 | 3.236570 | 0.511136 | 1.041990 | 0.364509 | 0.390018 | 0.354256 | 0.066035 | 0.621130 |
| 153 | Burundi | 154 | 2.905 | 3.074690 | 2.735310 | 0.091623 | 0.629794 | 0.151611 | 0.059901 | 0.204435 | 0.084148 | 1.683024 |
| 154 | Central African Republic | 155 | 2.693 | 2.864884 | 2.521116 | 0.000000 | 0.000000 | 0.018773 | 0.270842 | 0.280876 | 0.056565 | 2.066005 |
155 rows × 12 columns
## 4. Set Index and Sort Index in Ascending Order
```python
# Set new index to Country
happiness_df = happiness_df.set_index("Country")

# Sort the new DataFrame alphabetically
happiness_df.sort_index(inplace=True)
happiness_df.head(30)
```

| Country | Happiness.Rank | Happiness.Score | Whisker.high | Whisker.low | Economy..GDP.per.Capita. | Family | Health..Life.Expectancy. | Freedom | Generosity | Trust..Government.Corruption. | Dystopia.Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Afghanistan | 141 | 3.794 | 3.873661 | 3.714338 | 0.401477 | 0.581543 | 0.180747 | 0.106180 | 0.311871 | 0.061158 | 2.150801 |
| Albania | 109 | 4.644 | 4.752464 | 4.535536 | 0.996193 | 0.803685 | 0.731160 | 0.381499 | 0.201313 | 0.039864 | 1.490442 |
| Algeria | 53 | 5.872 | 5.978286 | 5.765714 | 1.091864 | 1.146217 | 0.617585 | 0.233336 | 0.069437 | 0.146096 | 2.567604 |
| Angola | 140 | 3.795 | 3.951642 | 3.638358 | 0.858428 | 1.104412 | 0.049869 | 0.000000 | 0.097926 | 0.069720 | 1.614482 |
| Argentina | 24 | 6.599 | 6.690085 | 6.507915 | 1.185295 | 1.440451 | 0.695137 | 0.494519 | 0.109457 | 0.059740 | 2.614005 |
| Armenia | 121 | 4.376 | 4.466735 | 4.285265 | 0.900597 | 1.007484 | 0.637524 | 0.198303 | 0.083488 | 0.026674 | 1.521499 |
| Australia | 10 | 7.284 | 7.356651 | 7.211349 | 1.484415 | 1.510042 | 0.843887 | 0.601607 | 0.477699 | 0.301184 | 2.065211 |
| Austria | 13 | 7.006 | 7.070670 | 6.941330 | 1.487097 | 1.459945 | 0.815328 | 0.567766 | 0.316472 | 0.221060 | 2.138506 |
| Azerbaijan | 85 | 5.234 | 5.299287 | 5.168714 | 1.153602 | 1.152400 | 0.540776 | 0.398156 | 0.045269 | 0.180988 | 1.762482 |
| Bahrain | 41 | 6.087 | 6.178989 | 5.995011 | 1.488412 | 1.323110 | 0.653133 | 0.536747 | 0.172668 | 0.257042 | 1.656149 |
| Bangladesh | 110 | 4.608 | 4.689822 | 4.526178 | 0.586683 | 0.735132 | 0.533241 | 0.478357 | 0.172255 | 0.123718 | 1.978736 |
| Belarus | 67 | 5.569 | 5.646114 | 5.491885 | 1.156558 | 1.444945 | 0.637714 | 0.295400 | 0.155138 | 0.156314 | 1.723233 |
| Belgium | 17 | 6.891 | 6.955821 | 6.826179 | 1.463781 | 1.462313 | 0.818092 | 0.539771 | 0.231503 | 0.251343 | 2.124210 |
| Belize | 50 | 5.956 | 6.197242 | 5.714757 | 0.907975 | 1.081418 | 0.450192 | 0.547509 | 0.240016 | 0.096581 | 2.631956 |
| Benin | 143 | 3.657 | 3.745784 | 3.568217 | 0.431085 | 0.435300 | 0.209930 | 0.425963 | 0.207948 | 0.060929 | 1.885631 |
| Bhutan | 97 | 5.011 | 5.079335 | 4.942666 | 0.885416 | 1.340127 | 0.495879 | 0.501538 | 0.474055 | 0.173380 | 1.140184 |
| Bolivia | 58 | 5.823 | 5.903977 | 5.742023 | 0.833757 | 1.227619 | 0.473630 | 0.558733 | 0.225561 | 0.060478 | 2.443279 |
| Bosnia and Herzegovina | 90 | 5.182 | 5.276336 | 5.087665 | 0.982409 | 1.069336 | 0.705186 | 0.204403 | 0.328867 | 0.000000 | 1.892173 |
| Botswana | 142 | 3.766 | 3.874123 | 3.657877 | 1.122094 | 1.221555 | 0.341756 | 0.505196 | 0.099348 | 0.098583 | 0.377914 |
| Brazil | 22 | 6.635 | 6.725470 | 6.544531 | 1.107353 | 1.431306 | 0.616552 | 0.437454 | 0.162350 | 0.111093 | 2.769267 |
| Bulgaria | 105 | 4.714 | 4.803695 | 4.624306 | 1.161459 | 1.434379 | 0.708218 | 0.289232 | 0.113178 | 0.011052 | 0.996139 |
| Burkina Faso | 134 | 4.032 | 4.124059 | 3.939941 | 0.350228 | 1.043280 | 0.215844 | 0.324368 | 0.250865 | 0.120328 | 1.727213 |
| Burundi | 154 | 2.905 | 3.074690 | 2.735310 | 0.091623 | 0.629794 | 0.151611 | 0.059901 | 0.204435 | 0.084148 | 1.683024 |
| Cambodia | 129 | 4.168 | 4.278518 | 4.057483 | 0.601765 | 1.006238 | 0.429783 | 0.633376 | 0.385923 | 0.068106 | 1.042941 |
| Cameroon | 107 | 4.695 | 4.796541 | 4.593459 | 0.564305 | 0.946018 | 0.132892 | 0.430389 | 0.236298 | 0.051307 | 2.333646 |
| Canada | 7 | 7.316 | 7.384403 | 7.247597 | 1.479204 | 1.481349 | 0.834558 | 0.611101 | 0.435540 | 0.287372 | 2.187264 |
| Central African Republic | 155 | 2.693 | 2.864884 | 2.521116 | 0.000000 | 0.000000 | 0.018773 | 0.270842 | 0.280876 | 0.056565 | 2.066005 |
| Chad | 137 | 3.936 | 4.034712 | 3.837289 | 0.438013 | 0.953856 | 0.041135 | 0.162342 | 0.216114 | 0.053582 | 2.071238 |
| Chile | 20 | 6.652 | 6.739251 | 6.564749 | 1.252785 | 1.284025 | 0.819480 | 0.376895 | 0.326662 | 0.082288 | 2.509586 |
| China | 79 | 5.273 | 5.319278 | 5.226721 | 1.081166 | 1.160837 | 0.741416 | 0.472788 | 0.028807 | 0.022794 | 1.764939 |
## 5. Select the Happiness Score for Vietnam
```python
# Grab the data contained within the row of a country of your choice and the "Happiness.Score" column
# Print the result as "<country's name> happiness score for 2017 was <happiness score>"
happy_score = happiness_df.loc["Vietnam", "Happiness.Score"]
print("Vietnam's happiness score for 2017 was " + str(happy_score))
```

```
Vietnam's happiness score for 2017 was 5.07399988174438
```
## 6. Select the Happiness Rank and Score for Belgium, Belize, Benin, Bhutan, and Bolivia
```python
# Grab all rows of data from "Belgium" to "Bolivia" and the columns from "Happiness.Rank" to "Happiness.Score"
# Hint - in order to view more data you will need to increase the dataframe's head
belgium_to_bolivia = happiness_df.loc[["Belgium", "Belize", "Benin", "Bhutan", "Bolivia"],
                                      ["Happiness.Rank", "Happiness.Score"]]
belgium_to_bolivia
```

| Country | Happiness.Rank | Happiness.Score |
|---|---|---|
| Belgium | 17 | 6.891 |
| Belize | 50 | 5.956 |
| Benin | 143 | 3.657 |
| Bhutan | 97 | 5.011 |
| Bolivia | 58 | 5.823 |
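The cell above passes an explicit list of country labels to `.loc`. Since the comment describes rows "from Belgium to Bolivia", it is worth noting that `.loc` also accepts label slices, and unlike positional slices these are inclusive of both endpoints. A minimal sketch, using a small stand-in DataFrame rather than the actual `happiness_df`:

```python
import pandas as pd

# Stand-in for happiness_df: a tiny frame indexed by country name
df = pd.DataFrame(
    {"Happiness.Rank": [17, 50, 143], "Happiness.Score": [6.891, 5.956, 3.657]},
    index=["Belgium", "Belize", "Benin"],
)
df.index.name = "Country"

# Label slices with .loc include BOTH endpoints,
# so this keeps Belgium, Belize, and Benin
sliced = df.loc["Belgium":"Benin", "Happiness.Rank":"Happiness.Score"]
print(sliced)
```

The same column-slice syntax (`"Happiness.Rank":"Happiness.Score"`) would work against the real DataFrame, provided the labels are unique.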
### 7. Select the Freedom and Government Trust for Every Country

```python
# Select all rows and only display the columns for "Freedom" and "Trust..Government.Corruption."
happiness_df.loc[:, ["Freedom", "Trust..Government.Corruption."]].head()
```

| Country | Freedom | Trust..Government.Corruption. |
|---|---|---|
| Afghanistan | 0.106180 | 0.061158 |
| Albania | 0.381499 | 0.039864 |
| Algeria | 0.233336 | 0.146096 |
| Angola | 0.000000 | 0.069720 |
| Argentina | 0.494519 | 0.059740 |
### 8. Select all the Countries Where GDP is Greater than or Equal to 1.6

```python
# Select all rows where the GDP per capita has a value greater than or equal to 1.6
gdp_score = happiness_df.loc[happiness_df["Economy..GDP.per.Capita."] >= 1.6, :]
gdp_score.head()
```

| Country | Happiness.Rank | Happiness.Score | Whisker.high | Whisker.low | Economy..GDP.per.Capita. | Family | Health..Life.Expectancy. | Freedom | Generosity | Trust..Government.Corruption. | Dystopia.Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Kuwait | 39 | 6.105 | 6.191957 | 6.018043 | 1.632952 | 1.259699 | 0.632106 | 0.496338 | 0.228290 | 0.215160 | 1.640425 |
| Luxembourg | 18 | 6.863 | 6.923686 | 6.802314 | 1.741944 | 1.457584 | 0.845089 | 0.596628 | 0.283181 | 0.318834 | 1.619512 |
| Norway | 1 | 7.537 | 7.594445 | 7.479556 | 1.616463 | 1.533524 | 0.796667 | 0.635423 | 0.362012 | 0.315964 | 2.277027 |
| Qatar | 35 | 6.375 | 6.568477 | 6.181523 | 1.870766 | 1.274297 | 0.710098 | 0.604131 | 0.330474 | 0.439299 | 1.145464 |
| Singapore | 26 | 6.572 | 6.636723 | 6.507277 | 1.692278 | 1.353814 | 0.949492 | 0.549841 | 0.345966 | 0.464308 | 1.216362 |
### 9. Select all the Countries Where the Happiness Score is Greater than or Equal to 5 and GDP is Less than or Equal to 1

```python
# Select all rows that have a happiness score greater than or equal to 5 and a GDP score less than or equal to 1
happy_but_poor_score = happiness_df.loc[
    (happiness_df["Happiness.Score"] >= 5) & (happiness_df["Economy..GDP.per.Capita."] <= 1), :
]
happy_but_poor_score.head()
```

| Country | Happiness.Rank | Happiness.Score | Whisker.high | Whisker.low | Economy..GDP.per.Capita. | Family | Health..Life.Expectancy. | Freedom | Generosity | Trust..Government.Corruption. | Dystopia.Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Belize | 50 | 5.956 | 6.197242 | 5.714757 | 0.907975 | 1.081418 | 0.450192 | 0.547509 | 0.240016 | 0.096581 | 2.631956 |
| Bhutan | 97 | 5.011 | 5.079335 | 4.942666 | 0.885416 | 1.340127 | 0.495879 | 0.501538 | 0.474055 | 0.173380 | 1.140184 |
| Bolivia | 58 | 5.823 | 5.903977 | 5.742023 | 0.833757 | 1.227619 | 0.473630 | 0.558733 | 0.225561 | 0.060478 | 2.443279 |
| Bosnia and Herzegovina | 90 | 5.182 | 5.276336 | 5.087665 | 0.982409 | 1.069336 | 0.705186 | 0.204403 | 0.328867 | 0.000000 | 1.892173 |
| El Salvador | 45 | 6.003 | 6.108635 | 5.897364 | 0.909784 | 1.182125 | 0.596019 | 0.432453 | 0.078258 | 0.089981 | 2.714594 |
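The cell above combines two conditions with `&`. Each condition must be wrapped in parentheses because `&` binds more tightly than comparison operators like `>=`. A minimal sketch with made-up data (the column names mirror the notebook's, but the values are illustrative):

```python
import pandas as pd

# Hypothetical stand-in data
df = pd.DataFrame(
    {"Happiness.Score": [5.9, 4.2, 6.1], "Economy..GDP.per.Capita.": [0.9, 0.8, 1.7]},
    index=["Belize", "Benin", "Qatar"],
)

# Parentheses are required: without them, & would try to bind
# 5 & df["Economy..GDP.per.Capita."] first and raise an error
mask = (df["Happiness.Score"] >= 5) & (df["Economy..GDP.per.Capita."] <= 1)
result = df.loc[mask]
print(result.index.tolist())  # -> ['Belize']
```

Use `|` for "or" with the same parenthesization rule.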
## Composing Masterpieces

You've been offered the opportunity to interview with REMAX, an international real estate company. The interview is for a Data Analyst role, which requires both development and visualization skills. All applicants are required to present a portfolio at the interview. You currently do not have a portfolio, so use the provided state loan data to come up with some snazzy visualizations to showcase in the interview.

```python
import pandas as pd
import hvplot.pandas
from pathlib import Path
```

### Prep the data
```python
# Read in loan data
loan_data = pd.read_csv(Path("../Resources/state_loan_data.csv"))

# Group data by state for state-level analysis
loan_data_state = loan_data.groupby('State Code').sum()
loan_data_state.head()
```

| State Code | Total Active Loans | Total Average Loan Amount | 2015 - 2016 | 2010 - 2014 | Self Help Loan | Leveraged Loan |
|---|---|---|---|---|---|---|
| AK | 897 | 148811.39 | 189583.49 | 172371.39 | 96 | 450 |
| AL | 7455 | 435054.53 | 686066.79 | 648663.44 | 18 | 605 |
| AR | 6186 | 263212.27 | 355445.70 | 351562.42 | 861 | 567 |
| AZ | 3982 | 597591.94 | 507532.91 | 599688.64 | 1718 | 1057 |
| CA | 8875 | 2805653.85 | 3008866.40 | 2881741.23 | 4134 | 2231 |
### Plot Total Average Loan Amount

```python
# Slice the grouped data for Total Average Loan Amount
loan_data_avg = loan_data_state['Total Average Loan Amount']
loan_data_avg_grp = loan_data_avg.sort_values()
loan_data_avg_grp.index.names = ['State Code 0']

# Plot Total Average Loan Amount
plot_state_avgs = loan_data_avg_grp.hvplot.bar(label='2019 Total Average Amount')
plot_state_avgs
```

### Plot Total Average Loan Amount for 2015 - 2016 and 2010 - 2014
```python
# Slice the grouped data for Total Average Loan Amount by 2015-2016 and 2010-2014 date ranges
loan_data_range_1 = loan_data_state['2015 - 2016']
loan_data_range_2 = loan_data_state['2010 - 2014']
loan_data_range_grp = loan_data_range_1.sort_values()
loan_data_range_grp_2 = loan_data_range_2.sort_values()

# Recall that the + operator aligns the indexes of both plots by default.
# We prevent this by renaming the indexes.
loan_data_range_grp.index.names = ['State Code 1']
loan_data_range_grp_2.index.names = ['State Code 2']

# Plot data for date ranges
plot_2015_2016 = loan_data_range_grp.hvplot(label='2015 - 2016')
plot_2010_2014 = loan_data_range_grp_2.hvplot(label='2010 - 2014')
```

### Compose plots for 2015 - 2016 and 2010 - 2014 using the + operator
### Compose plots for state averages, 2015 - 2016, and 2010 - 2014 using the + operator

```python
# Compose plots
plot_state_avgs + plot_2015_2016 + plot_2010_2014
```

### Compose plots for state averages, 2015 - 2016, and 2010 - 2014 using the * operator
Notice that we are using the original sorting of each column for this graph (which was alphabetical).

```python
(
    loan_data_state['Total Average Loan Amount'].hvplot.bar(label='2019 Total Average Amount')
    * loan_data_state['2015 - 2016'].hvplot(label='2015 - 2016')
    * loan_data_state['2010 - 2014'].hvplot(label='2010 - 2014')
)
```

## Plotting with Plotly

For the second round of interviews, REMAX is requesting that applicants create interactive plots using Plotly Express instead of hvPlot. Create two scatter plots using the provided foreclosure data.
```python
import plotly.express as px
import pandas as pd
from pathlib import Path
```

### Plot Foreclosures

```python
# Read in data
foreclosures = pd.read_csv(
    Path("../../Resources/alleghany_foreclosures.csv"),
    infer_datetime_format=True,
    parse_dates=True,
    index_col="filing_date",
)

# Slice data and group
foreclosures_grp = (
    foreclosures[["municipality", "amount"]]
    .groupby([foreclosures.index.year, "municipality"])
    .count()
    .reset_index()
)
foreclosures_grp.head()
```

| | filing_date | municipality | amount |
|---|---|---|---|
| 0 | 2009 | Aleppo Township | 5 |
| 1 | 2009 | Aspinwall Boro | 4 |
| 2 | 2009 | Avalon Boro | 22 |
| 3 | 2009 | Baldwin Boro | 47 |
| 4 | 2009 | Baldwin Township | 11 |
### Plot the relationship between municipality and the number of foreclosures

```python
# Create scatter plot
px.scatter(foreclosures_grp, x="municipality", y="amount", color="filing_date")
```

### Plot the relationship between filing date and municipality

```python
# Create scatter plot
px.scatter(foreclosures_grp, x="municipality", y="filing_date", size="amount")
```

### Plot the progression of foreclosures

```python
# Create area plot
px.area(
    foreclosures_grp,
    x="filing_date",
    y="amount",
    color="municipality",
    line_group="municipality",
)
```
## Demo: Forecasting Stock Performance in Our Portfolio

This program retrieves stock price data using the Alpaca API and then simulates future performance using a Monte Carlo simulation.
### Import Dependencies

```python
# Import libraries and dependencies
import os
import pandas as pd
import alpaca_trade_api as tradeapi
from MCForecastTools import MCSimulation

# Load .env environment variables
from dotenv import load_dotenv
load_dotenv()
```

```
True
```
```python
# Set Alpaca API key and secret
alpaca_api_key = os.getenv("ALPACA_API_KEY")
alpaca_secret_key = os.getenv("ALPACA_SECRET_KEY")
api = tradeapi.REST(
    alpaca_api_key,
    alpaca_secret_key,
    api_version="v2"
)
```

### Get the Past 3 Years' Worth of Microsoft and Coca-Cola Stock Price Data via an Alpaca API Call

```python
# Set timeframe to '1D'
timeframe = "1D"

# Set start and end datetimes between now and 3 years ago.
start_date = pd.Timestamp("2017-05-01", tz="America/New_York").isoformat()
end_date = pd.Timestamp("2020-05-01", tz="America/New_York").isoformat()

# Set the ticker information
tickers = ["MSFT", "KO"]

# Get 3 years' worth of historical price data for Microsoft and Coca-Cola
df_ticker = api.get_barset(
    tickers,
    timeframe,
    start=start_date,
    end=end_date,
    limit=1000,
).df

# Display sample data
df_ticker.head()
```

| | KO open | KO high | KO low | KO close | KO volume | MSFT open | MSFT high | MSFT low | MSFT close | MSFT volume |
|---|---|---|---|---|---|---|---|---|---|---|
| 2017-05-01 00:00:00-04:00 | 43.15 | 43.500 | 43.10 | 43.230 | 11470244 | 68.67 | 69.55 | 68.500 | 69.43 | 27119635 |
| 2017-05-02 00:00:00-04:00 | 43.22 | 43.440 | 43.16 | 43.380 | 9733052 | 69.71 | 69.71 | 69.130 | 69.30 | 19179268 |
| 2017-05-03 00:00:00-04:00 | 43.33 | 43.485 | 43.15 | 43.315 | 8712620 | 69.38 | 69.38 | 68.710 | 69.08 | 24576064 |
| 2017-05-04 00:00:00-04:00 | 43.42 | 43.810 | 43.40 | 43.660 | 8638866 | 69.03 | 69.08 | 68.640 | 68.81 | 18561598 |
| 2017-05-05 00:00:00-04:00 | 43.79 | 43.840 | 43.56 | 43.680 | 6678133 | 68.90 | 69.03 | 68.485 | 68.99 | 15644971 |
### The MCForecastTools Library at a Glance

```python
# Print the documentation of the MCSimulation module of the MCForecastTools library
?MCSimulation
```

```
Init signature:
MCSimulation(
    portfolio_data,
    weights='',
    num_simulation=1000,
    num_trading_days=252,
)
Docstring:
A Python class for runnning Monte Carlo simulation on portfolio price data. ...

Attributes
----------
portfolio_data : pandas.DataFrame
    portfolio dataframe
weights: list(float)
    portfolio investment breakdown
nSim: int
    number of samples in simulation
nTrading: int
    number of trading days to simulate
simulated_return : pandas.DataFrame
    Simulated data from Monte Carlo
confidence_interval : pandas.Series
    the 95% confidence intervals for simulated final cumulative returns

Init docstring:
Constructs all the necessary attributes for the MCSimulation object.

Parameters
----------
portfolio_data: pandas.DataFrame
    DataFrame containing stock price information from Alpaca API
weights: list(float)
    A list fractions representing percentage of total investment per stock. DEFAULT: Equal distribution
num_simulation: int
    Number of simulation samples. DEFAULT: 1000 simulation samples
num_trading_days: int
    Number of trading days to simulate. DEFAULT: 252 days (1 year of business days)

File:      ~/Code/FinTech-Lesson-Plans/01-Lesson-Plans/05-APIs/3/Activities/03-Ins_Portfolio_Forecasting_Monte_Carlo/Solved/MCForecastTools.py
Type:      type
Subclasses:
```
### Simulate five-year portfolio growth using Monte Carlo simulation

```python
# Configuring a Monte Carlo simulation to forecast five years of cumulative returns
MC_fiveyear = MCSimulation(
    portfolio_data=df_ticker,
    weights=[.60, .40],
    num_simulation=500,
    num_trading_days=252 * 5
)

# Printing the simulation input data
MC_fiveyear.portfolio_data.head()
```

| | KO open | KO high | KO low | KO close | KO volume | KO daily_return | MSFT open | MSFT high | MSFT low | MSFT close | MSFT volume | MSFT daily_return |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2017-05-01 00:00:00-04:00 | 43.15 | 43.500 | 43.10 | 43.230 | 11470244 | NaN | 68.67 | 69.55 | 68.500 | 69.43 | 27119635 | NaN |
| 2017-05-02 00:00:00-04:00 | 43.22 | 43.440 | 43.16 | 43.380 | 9733052 | 0.003470 | 69.71 | 69.71 | 69.130 | 69.30 | 19179268 | -0.001872 |
| 2017-05-03 00:00:00-04:00 | 43.33 | 43.485 | 43.15 | 43.315 | 8712620 | -0.001498 | 69.38 | 69.38 | 68.710 | 69.08 | 24576064 | -0.003175 |
| 2017-05-04 00:00:00-04:00 | 43.42 | 43.810 | 43.40 | 43.660 | 8638866 | 0.007965 | 69.03 | 69.08 | 68.640 | 68.81 | 18561598 | -0.003909 |
| 2017-05-05 00:00:00-04:00 | 43.79 | 43.840 | 43.56 | 43.680 | 6678133 | 0.000458 | 68.90 | 69.03 | 68.485 | 68.99 | 15644971 | 0.002616 |
```python
# Running a Monte Carlo simulation to forecast five years of cumulative returns
MC_fiveyear.calc_cumulative_return()
```

```
Running Monte Carlo simulation number 0.
Running Monte Carlo simulation number 10.
Running Monte Carlo simulation number 20.
...
Running Monte Carlo simulation number 490.
```

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 490 | 491 | 492 | 493 | 494 | 495 | 496 | 497 | 498 | 499 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| 1 | 1.011183 | 1.003640 | 0.995387 | 1.003003 | 1.002283 | 0.991261 | 0.993851 | 1.004959 | 0.995040 | 1.005974 | ... | 0.987744 | 0.999161 | 1.001092 | 1.015067 | 0.991717 | 0.993505 | 1.005455 | 0.992440 | 1.014116 | 0.996327 |
| 2 | 1.010516 | 1.013914 | 0.986976 | 0.994755 | 0.988325 | 0.997444 | 0.982326 | 1.005981 | 1.001946 | 1.013416 | ... | 0.988858 | 1.002668 | 1.005572 | 1.002438 | 1.004596 | 0.979046 | 1.003859 | 1.007666 | 1.025195 | 0.991248 |
| 3 | 1.020538 | 1.014956 | 0.995326 | 0.971891 | 1.002391 | 1.008548 | 0.998577 | 1.009823 | 1.023618 | 1.006224 | ... | 1.007736 | 1.009974 | 1.001745 | 1.003470 | 1.013095 | 0.996308 | 1.013724 | 1.001729 | 1.010593 | 0.994025 |
| 4 | 1.008357 | 1.015980 | 0.990913 | 0.967137 | 0.995449 | 1.011326 | 1.007435 | 1.013989 | 1.032095 | 0.995030 | ... | 0.998612 | 1.015970 | 0.978090 | 1.026153 | 1.020178 | 1.001752 | 1.022909 | 0.996071 | 1.014151 | 0.999556 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1256 | 2.320943 | 3.625409 | 3.503447 | 1.085580 | 1.246446 | 1.439788 | 1.434845 | 2.251037 | 2.848876 | 1.858250 | ... | 3.062218 | 2.144138 | 1.879846 | 2.887162 | 1.406283 | 4.133657 | 1.443959 | 1.986593 | 1.575785 | 4.102522 |
| 1257 | 2.298058 | 3.670266 | 3.450661 | 1.093163 | 1.257378 | 1.437472 | 1.410902 | 2.249188 | 2.823875 | 1.860901 | ... | 3.094995 | 2.138825 | 1.869621 | 2.874659 | 1.434135 | 4.148609 | 1.428483 | 1.990941 | 1.552572 | 4.075057 |
| 1258 | 2.333132 | 3.619737 | 3.421655 | 1.104514 | 1.269939 | 1.423639 | 1.428474 | 2.263686 | 2.866994 | 1.865590 | ... | 3.090344 | 2.174377 | 1.872586 | 2.897630 | 1.441873 | 4.162761 | 1.412225 | 1.969911 | 1.548777 | 4.108532 |
| 1259 | 2.314822 | 3.675878 | 3.392552 | 1.120610 | 1.267853 | 1.433143 | 1.444003 | 2.287083 | 2.925508 | 1.881977 | ... | 3.114467 | 2.166647 | 1.836733 | 2.940865 | 1.454774 | 4.256929 | 1.408203 | 1.939488 | 1.550266 | 4.120576 |
| 1260 | 2.331860 | 3.727082 | 3.401183 | 1.110582 | 1.250750 | 1.448851 | 1.432627 | 2.318826 | 2.954205 | 1.878881 | ... | 3.066632 | 2.141222 | 1.838587 | 2.956516 | 1.449248 | 4.273349 | 1.436359 | 1.958227 | 1.557093 | 4.071631 |
1261 rows × 500 columns
```python
# Plot simulation outcomes
line_plot = MC_fiveyear.plot_simulation()

# Save the plot for future usage
line_plot.get_figure().savefig("MC_fiveyear_sim_plot.png", bbox_inches="tight")

# Plot probability distribution and confidence intervals
dist_plot = MC_fiveyear.plot_distribution()

# Save the plot for future usage
dist_plot.get_figure().savefig('MC_fiveyear_dist_plot.png', bbox_inches='tight')

# Fetch summary statistics from the Monte Carlo simulation results
tbl = MC_fiveyear.summarize_cumulative_return()

# Print summary statistics
print(tbl)
```

```
count           500.000000
mean              2.325262
std               0.990804
min               0.775205
25%               1.606110
50%               2.129262
75%               2.787852
max               6.779987
95% CI Lower      0.987389
95% CI Upper      4.934141
Name: 1260, dtype: float64
```
```python
# Use the lower and upper 95% confidence intervals to calculate the range of the possible
# outcomes of our $10,000 investment in Coca-Cola and Microsoft stocks
ci_lower = round(tbl[8] * 10000, 2)
ci_upper = round(tbl[9] * 10000, 2)

# Print results
print(f"There is a 95% chance that an initial investment of $10,000 in the portfolio"
      f" over the next 5 years will end within the range of"
      f" ${ci_lower} and ${ci_upper}")
```

```
There is a 95% chance that an initial investment of $10,000 in the portfolio over the next 5 years will end within the range of $9873.89 and $49341.41
```
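The internals of `MCSimulation` are not shown here, but the core idea it implements is straightforward: draw random daily returns, compound them, and look at the distribution of final outcomes. A standard-library sketch of that idea, with all names and parameters made up for illustration (normally distributed returns are assumed, which is one common modeling choice, not necessarily what the library does):

```python
import random
import statistics

def simulate_cumulative_returns(mean_daily, std_daily, num_days, num_sims, seed=42):
    """Sketch of a Monte Carlo forecast: draw normally distributed daily
    returns and compound them into one cumulative return per simulation."""
    rng = random.Random(seed)
    finals = []
    for _ in range(num_sims):
        cumulative = 1.0
        for _ in range(num_days):
            cumulative *= 1 + rng.gauss(mean_daily, std_daily)
        finals.append(cumulative)
    return finals

# Illustrative inputs: 0.05% mean daily return, 1% daily volatility, 1 trading year
finals = simulate_cumulative_returns(0.0005, 0.01, 252, 500)

# A crude 95% interval taken from the empirical distribution of outcomes
finals.sort()
lower = finals[int(0.025 * len(finals))]
upper = finals[int(0.975 * len(finals))]
print(f"median={statistics.median(finals):.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Multiplying the interval endpoints by an initial investment, as in the cell above, converts the growth factors into dollar outcomes.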
### Import Libraries and Dependencies

```python
import pandas as pd
from pathlib import Path
```

### Read CSV as Pandas DataFrame

```python
# Read csv data
csv_path = Path("../../Resources/goog_google_finance.csv")
goog_df = pd.read_csv(csv_path, parse_dates=True, index_col="Date", infer_datetime_format=True)
goog_df.head()
```

| Date | Close |
|---|---|
| 2019-05-13 | 1132.03 |
| 2019-05-10 | 1164.27 |
| 2019-05-09 | 1162.38 |
| 2019-05-08 | 1166.27 |
| 2019-05-07 | 1174.10 |
### Assess & Clean Data

```python
# Check for nulls
goog_df.isnull().mean() * 100

# Drop nulls
goog_df = goog_df.dropna()

# Drop duplicates
goog_df = goog_df.drop_duplicates()

# Validate no more missing values
goog_df.isnull().sum()
```

```
Close    0
dtype: int64
```
### Group by year and month

```python
# Set multi-index by grouping
goog_df_grp = goog_df.groupby([goog_df.index.year, goog_df.index.month]).first()
goog_df_grp.head()
```

| Date | Date | Close |
|---|---|---|
| 2019 | 2 | 1119.92 |
| | 3 | 1173.31 |
| | 4 | 1188.48 |
| | 5 | 1132.03 |
### Access Close for May 2019 Using a Multi-indexing Lookup

```python
# Select GOOG Close for May 2019
google_may_2019_data = goog_df_grp.loc[2019, 5]
google_may_2019_data
```

```
Close    1132.03
Name: (2019, 5), dtype: float64
```
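Grouping by two levels of a `DatetimeIndex`, as above, produces a two-level `MultiIndex`, and `.loc[2019, 5]` then looks up the `(year, month)` pair. A minimal sketch with a stand-in DataFrame (the dates and prices are made up):

```python
import pandas as pd

# Stand-in for goog_df: daily closes with a DatetimeIndex
idx = pd.to_datetime(["2019-04-01", "2019-04-15", "2019-05-07", "2019-05-13"])
df = pd.DataFrame({"Close": [1188.48, 1190.00, 1174.10, 1132.03]}, index=idx)

# Grouping by year and month yields a two-level MultiIndex
grp = df.groupby([df.index.year, df.index.month]).first()

# Tuple-style lookup on the MultiIndex: (year, month)
may = grp.loc[(2019, 5)]
print(float(may["Close"]))
```

Passing only the first level (`grp.loc[2019]`) returns all months of that year, which is what the next cell exploits to compute a yearly mean.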
### Calculate the mean Close price of GOOG for 2019

```python
goog_mean_2019 = goog_df_grp.loc[2019].mean()
goog_mean_2019
```

```
Close    1153.435
dtype: float64
```
## Risky Business

Harold has been boasting about his recent cryptocurrency returns. He's even bet you five bucks that his trading algorithm has made his returns better than yours. Calculate the standard deviation and Sharpe ratio for each of your portfolios to determine which portfolio is the better investment based on its risk-to-reward ratio.

```python
import pandas as pd
import numpy as np
from pathlib import Path
%matplotlib inline
```

### Read in data

```python
harold_portfolio_path = Path("../../Resources/harold_portfolio.csv")
my_portfolio_path = Path("../../Resources/my_portfolio.csv")

# Read in data
harold_portfolio = pd.read_csv(harold_portfolio_path, parse_dates=True, infer_datetime_format=True)
my_portfolio = pd.read_csv(my_portfolio_path, parse_dates=True, infer_datetime_format=True)

# Drop NA values
harold_portfolio = harold_portfolio.dropna()
my_portfolio = my_portfolio.dropna()

# Set Date as index
harold_portfolio = harold_portfolio.set_index('Date')
my_portfolio = my_portfolio.set_index('Date')
harold_portfolio.head()
```

| Date | BTC | BTT | DOGE | ETH | LTC | XRP | SOLVE | XLM | TRON | ZIL |
|---|---|---|---|---|---|---|---|---|---|---|
| 2019-05-20 | 8118.39 | 0.001187 | 0.003041 | 261.09 | 95.06 | 0.398003 | 0.338667 | 0.135545 | 0.028509 | 0.017992 |
| 2019-05-19 | 8200.00 | 0.001056 | 0.003178 | 234.03 | 86.65 | 0.417700 | 0.300430 | 0.141485 | 0.029277 | 0.019170 |
| 2019-05-18 | 7262.40 | 0.000993 | 0.003005 | 244.59 | 89.88 | 0.372736 | 0.287630 | 0.128833 | 0.026823 | 0.017685 |
| 2019-05-17 | 7363.69 | 0.000957 | 0.002962 | 264.39 | 95.75 | 0.386193 | 0.259506 | 0.126562 | 0.027220 | 0.017892 |
| 2019-05-16 | 7878.96 | 0.000885 | 0.003168 | 249.00 | 102.26 | 0.419707 | 0.223316 | 0.137283 | 0.028857 | 0.018725 |
### Calculate daily returns

```python
# Calculate daily returns for each portfolio
harold_returns = harold_portfolio.pct_change().dropna()
my_returns = my_portfolio.pct_change().dropna()
harold_returns.head()
```

| Date | BTC | BTT | DOGE | ETH | LTC | XRP | SOLVE | XLM | TRON | ZIL |
|---|---|---|---|---|---|---|---|---|---|---|
| 2019-05-19 | 0.010052 | -0.110362 | 0.045051 | -0.103642 | -0.088470 | 0.049490 | -0.112904 | 0.043823 | 0.026939 | 0.065474 |
| 2019-05-18 | -0.114341 | -0.059659 | -0.054437 | 0.045122 | 0.037276 | -0.107647 | -0.042606 | -0.089423 | -0.083820 | -0.077465 |
| 2019-05-17 | 0.013947 | -0.036254 | -0.014309 | 0.080952 | 0.065309 | 0.036103 | -0.097778 | -0.017627 | 0.014801 | 0.011705 |
| 2019-05-16 | 0.069974 | -0.075235 | 0.069548 | -0.058209 | 0.067990 | 0.086780 | -0.139457 | 0.084709 | 0.060140 | 0.046557 |
| 2019-05-15 | 0.041168 | -0.037288 | 0.044192 | -0.126747 | -0.103364 | 0.086484 | 0.007420 | -0.002047 | 0.086322 | 0.088224 |
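The `pct_change` call above computes each row's return relative to the previous row: `(current - previous) / previous`. The underlying arithmetic, sketched with plain Python on made-up prices:

```python
def pct_change(values):
    """Period-over-period percentage change, mirroring pandas'
    Series.pct_change (the first element has no predecessor,
    so it is skipped here instead of being emitted as NaN)."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]

prices = [100.0, 110.0, 99.0]
returns = pct_change(prices)
print(returns)
```

Note that the portfolio CSVs above are sorted newest-first, so pandas computes each "return" relative to the chronologically later row; the dispersion statistics used below are unaffected by that ordering.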
### Concat portfolios into one DataFrame

```python
# Concat returns DataFrames
all_returns = pd.concat([harold_returns, my_returns], axis='columns', join='inner')
all_returns.head()
```

| Date | BTC | BTT | DOGE | ETH | LTC | XRP | SOLVE | XLM | TRON | ZIL | BTC | DOGE | ETH | LTC | XRP | XLM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-05-19 | 0.010052 | -0.110362 | 0.045051 | -0.103642 | -0.088470 | 0.049490 | -0.112904 | 0.043823 | 0.026939 | 0.065474 | -0.114341 | 0.045051 | -0.103642 | -0.088470 | 0.049490 | 0.043823 |
| 2019-05-18 | -0.114341 | -0.059659 | -0.054437 | 0.045122 | 0.037276 | -0.107647 | -0.042606 | -0.089423 | -0.083820 | -0.077465 | 0.013947 | -0.054437 | 0.045122 | 0.037276 | -0.107647 | -0.089423 |
| 2019-05-17 | 0.013947 | -0.036254 | -0.014309 | 0.080952 | 0.065309 | 0.036103 | -0.097778 | -0.017627 | 0.014801 | 0.011705 | 0.069974 | -0.014309 | 0.080952 | 0.065309 | 0.036103 | -0.017627 |
| 2019-05-16 | 0.069974 | -0.075235 | 0.069548 | -0.058209 | 0.067990 | 0.086780 | -0.139457 | 0.084709 | 0.060140 | 0.046557 | 0.041168 | 0.069548 | -0.058209 | 0.067990 | 0.086780 | 0.084709 |
| 2019-05-15 | 0.041168 | -0.037288 | 0.044192 | -0.126747 | -0.103364 | 0.086484 | 0.007420 | -0.002047 | 0.086322 | 0.088224 | -0.025892 | 0.044192 | -0.126747 | -0.103364 | 0.086484 | -0.002047 |
### Calculate std dev for all data

```python
# Calculate std dev
all_portfolio_std = all_returns.std()
all_portfolio_std.head()
```

```
BTC     0.070488
BTT     0.030731
DOGE    0.050907
ETH     0.091517
LTC     0.084703
dtype: float64
```
### Calculate Sharpe ratios for all data

```python
# Calculate Sharpe ratios
sharpe_ratios = (all_returns.mean() * 252) / (all_portfolio_std * np.sqrt(252))
sharpe_ratios.head()
```

```
BTC      0.936891
BTT    -32.935733
DOGE     5.615811
ETH     -5.638309
LTC     -0.796855
dtype: float64
```
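The formula above annualizes the mean daily return (× 252) and the daily standard deviation (× √252), then takes their ratio; algebraically this is √252 × mean/std. A standard-library sketch of the same arithmetic, assuming a zero risk-free rate as the cell above implicitly does (the sample returns are made up):

```python
import math
import statistics

def sharpe_ratio(daily_returns, trading_days=252):
    """Annualized Sharpe ratio with a zero risk-free rate:
    (mean * N) / (std * sqrt(N)), which simplifies to sqrt(N) * mean / std."""
    mean = statistics.mean(daily_returns)
    std = statistics.stdev(daily_returns)  # sample std, matching pandas' default ddof=1
    return (mean * trading_days) / (std * math.sqrt(trading_days))

returns = [0.01, -0.005, 0.007, 0.002, -0.001]
print(round(sharpe_ratio(returns), 4))
```

A positive Sharpe ratio means the asset earned more than the (zero) risk-free rate per unit of volatility; the higher, the better the risk-adjusted return.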
### Plot Sharpe Ratios

```python
# Plot
sharpe_ratios.plot.bar(title='Sharpe Ratios')
```

```
<matplotlib.axes._subplots.AxesSubplot at 0x11c4abe10>
```
### How many smart investments did Harold make compared to risky investments? How many did you make?

Out of his 10 investments, Harold only made 4 good investments. Out of the students' 6 investments, 3 of them were smart investments.

### Which cryptos have been the smartest investment?

DOGE, TRON, and XLM have been the smartest crypto investments.
### Calculate annualized std dev for each portfolio individually

```python
# Calculate annualized std dev for all investments in each portfolio
harold_std_annual = harold_returns.std() * np.sqrt(252)
my_std_annual = my_returns.std() * np.sqrt(252)
harold_std_annual
```

```
BTC      1.118962
BTT      0.487842
DOGE     0.808117
ETH      1.452783
LTC      1.344613
XRP      1.274365
SOLVE    0.936847
XLM      1.044947
TRON     1.030614
ZIL      1.027496
dtype: float64
```
### Calculate the Sharpe ratios for each individual portfolio

```python
# Calculate Sharpe ratios
harold_sharpe_ratios = (harold_returns.mean() * 252) / harold_std_annual
my_sharpe_ratios = (my_returns.mean() * 252) / my_std_annual
my_sharpe_ratios.head()
```

```
BTC    -0.666636
DOGE   -0.020114
ETH    -0.389330
LTC    -0.285959
XRP    -0.212338
dtype: float64
```
### Average the Sharpe ratios for each portfolio and output to screen

```python
# Average the Sharpe ratios
harold_sharpe_avg = harold_sharpe_ratios.mean()
my_sharpe_avg = my_sharpe_ratios.mean()
print("Harold's Sharpe ratio average:", harold_sharpe_avg)
print("My Sharpe ratio average:", my_sharpe_avg)
```

```
Harold's Sharpe ratio average: -3.4928380351166615
My Sharpe ratio average: -0.27781543864562924
```

### Compare Sharpe Ratio Averages

```python
# Use a comparison operator to determine if Harold's Sharpe ratio average is greater
harold_sharpe_avg > my_sharpe_avg
```

```
False
```
### Which portfolio is the smartest investment, based on its risk-to-reward ratio?

My portfolio!
## Student Activity: Portfolio Planner Part 2
This program is part 2 of a two-part student activity and continues the following:
PART I: Portfolio Optimization via Risk Evaluation
Reads in the CSV datasets of 10 stocks, calculates the volatility of each stock, drops the five most volatile stocks, sets allocations for the remaining stocks based on risk/volatility, and calculates the returns of a hypothetical $10,000 investment for the constructed portfolio.
PART II: Portfolio Optimization via Correlation and Return-to-Risk (Sharpe Ratio) Evaluations
Alternatively filters the same 10 stocks (from Part 1) down to only minimally correlated stocks with higher Sharpe ratios, and calculates the returns of a hypothetical $10,000 investment for the constructed optimized portfolio. Returns of the optimized portfolio are then compared to less-optimized portfolios.
## PART I: Portfolio Optimization via Risk Evaluation

```python
# Import libraries and dependencies
import os
from pathlib import Path
import numpy as np
import pandas as pd
import seaborn as sns
%matplotlib inline
import matplotlib.pyplot as plt
import plotly.express as px
from MCForecastTools import MCSimulation
```

```
/opt/anaconda3/lib/python3.7/site-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm
```
### Read CSVs as DataFrames

```python
# Set the file paths
bk_data = Path("../Resources/bk_data.csv")
fang_data = Path("../Resources/fang_data.csv")
jnj_data = Path("../Resources/jnj_data.csv")
luv_data = Path("../Resources/luv_data.csv")
mu_data = Path("../Resources/mu_data.csv")
nke_data = Path("../Resources/nke_data.csv")
sbux_data = Path("../Resources/sbux_data.csv")
t_data = Path("../Resources/t_data.csv")
wdc_data = Path("../Resources/wdc_data.csv")
wrk_data = Path("../Resources/wrk_data.csv")

# Read the CSVs and set the `date` column as a datetime index for each DataFrame
bk_df = pd.read_csv(bk_data, index_col="date", infer_datetime_format=True, parse_dates=True)
fang_df = pd.read_csv(fang_data, index_col="date", infer_datetime_format=True, parse_dates=True)
jnj_df = pd.read_csv(jnj_data, index_col="date", infer_datetime_format=True, parse_dates=True)
luv_df = pd.read_csv(luv_data, index_col="date", infer_datetime_format=True, parse_dates=True)
mu_df = pd.read_csv(mu_data, index_col="date", infer_datetime_format=True, parse_dates=True)
nke_df = pd.read_csv(nke_data, index_col="date", infer_datetime_format=True, parse_dates=True)
sbux_df = pd.read_csv(sbux_data, index_col="date", infer_datetime_format=True, parse_dates=True)
t_df = pd.read_csv(t_data, index_col="date", infer_datetime_format=True, parse_dates=True)
wdc_df = pd.read_csv(wdc_data, index_col="date", infer_datetime_format=True, parse_dates=True)
wrk_df = pd.read_csv(wrk_data, index_col="date", infer_datetime_format=True, parse_dates=True)

# Display a few rows
wrk_df.head()
```

| date | close |
|---|---|
| 2019-05-20 | 35.15 |
| 2019-05-17 | 36.66 |
| 2019-05-16 | 37.42 |
| 2019-05-15 | 37.44 |
| 2019-05-14 | 37.19 |
### Combine DataFrames, Sort Index, and Rename Columns

```python
# Create a new pivot table where the columns are the closing prices for each ticker
combined_df = pd.concat(
    [bk_df, fang_df, jnj_df, luv_df, mu_df, nke_df, sbux_df, t_df, wdc_df, wrk_df],
    axis="columns",
    join="inner"
)

# Sort datetime index in ascending order (past to present)
combined_df.sort_index(inplace=True)

# Set column names
combined_df.columns = ['BK', 'FANG', 'JNJ', 'LUV', 'MU', 'NKE', 'SBUX', 'T', 'WDC', 'WRK']

# Display a few rows
combined_df.head()
```

| date | BK | FANG | JNJ | LUV | MU | NKE | SBUX | T | WDC | WRK |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015-06-24 | 43.29 | 77.96 | 99.33 | 34.35 | 24.06 | 53.110 | 53.710 | 35.78 | 87.73 | 58.5699 |
| 2015-06-25 | 42.85 | 76.39 | 99.12 | 34.49 | 24.02 | 52.610 | 54.070 | 36.18 | 85.95 | 57.2022 |
| 2015-06-26 | 42.98 | 75.25 | 99.64 | 34.32 | 19.66 | 54.855 | 54.620 | 36.12 | 84.45 | 57.2022 |
| 2015-06-29 | 41.79 | 75.16 | 97.68 | 33.18 | 18.73 | 53.835 | 53.550 | 35.77 | 81.96 | 56.1857 |
| 2015-06-30 | 41.97 | 75.38 | 97.46 | 33.09 | 18.84 | 54.010 | 53.615 | 35.52 | 78.42 | 56.2781 |
### Calculate Daily Returns

```python
# Use the `pct_change` function to calculate daily returns
daily_returns = combined_df.pct_change()
daily_returns.head()
```

| date | BK | FANG | JNJ | LUV | MU | NKE | SBUX | T | WDC | WRK |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015-06-24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2015-06-25 | -0.010164 | -0.020139 | -0.002114 | 0.004076 | -0.001663 | -0.009414 | 0.006703 | 0.011179 | -0.020290 | -0.023352 |
| 2015-06-26 | 0.003034 | -0.014923 | 0.005246 | -0.004929 | -0.181515 | 0.042672 | 0.010172 | -0.001658 | -0.017452 | 0.000000 |
| 2015-06-29 | -0.027687 | -0.001196 | -0.019671 | -0.033217 | -0.047304 | -0.018594 | -0.019590 | -0.009690 | -0.029485 | -0.017770 |
| 2015-06-30 | 0.004307 | 0.002927 | -0.002252 | -0.002712 | 0.005873 | 0.003251 | 0.001214 | -0.006989 | -0.043192 | 0.001645 |
### Evaluate Riskiness of Stocks
```python
# Use the `std` function and multiply by the square root of the number of
# trading days in a year to get annualized volatility
volatility = daily_returns.std() * np.sqrt(252)
volatility.sort_values(inplace=True)
volatility
```

```
JNJ     0.161128
T       0.187579
SBUX    0.206805
BK      0.224967
NKE     0.246045
LUV     0.280021
WRK     0.303601
FANG    0.370704
WDC     0.415605
MU      0.479955
dtype: float64
```
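The `sqrt(252)` factor comes from treating daily returns as independent: variance scales linearly with time, so standard deviation scales with the square root of the number of trading days (roughly 252 per year). A sketch on hypothetical returns:

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for one stock
r = pd.Series([0.01, -0.02, 0.015, -0.005, 0.0])

daily_std = r.std()                    # pandas uses the sample std (ddof=1)
annualized = daily_std * np.sqrt(252)  # scale up to a yearly horizon

# Scaling the std by sqrt(252) equals scaling the variance by 252 first
assert np.isclose(annualized, np.sqrt(r.var() * 252))
```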
### Drop High Volatility Stocks
```python
# Drop the 5 stocks with the highest volatility in daily returns
daily_returns.drop(columns=['MU', 'WDC', 'FANG', 'WRK', 'LUV'], inplace=True)
daily_returns.head()
```

| date | BK | JNJ | NKE | SBUX | T |
|---|---|---|---|---|---|
| 2015-06-24 | NaN | NaN | NaN | NaN | NaN |
| 2015-06-25 | -0.010164 | -0.002114 | -0.009414 | 0.006703 | 0.011179 |
| 2015-06-26 | 0.003034 | 0.005246 | 0.042672 | 0.010172 | -0.001658 |
| 2015-06-29 | -0.027687 | -0.019671 | -0.018594 | -0.019590 | -0.009690 |
| 2015-06-30 | 0.004307 | -0.002252 | 0.003251 | 0.001214 | -0.006989 |
### Set Portfolio Allocations/Weights and Calculate Portfolio Daily Returns
```python
# Set weights for the corresponding risk profile of the stocks, then use the
# `dot` function to sum the products of each weight and the corresponding
# stock daily return
# BK, JNJ, NKE, SBUX, T
weights = [0.15, 0.05, 0.10, 0.2, 0.50]
portfolio_returns = daily_returns.dot(weights)
portfolio_returns.head()
```

```
date
2015-06-24         NaN
2015-06-25    0.004358
2015-06-26    0.006190
2015-06-29   -0.015759
2015-06-30   -0.002393
dtype: float64
```
### Calculate Cumulative Returns
```python
# Use the `cumprod` function to calculate cumulative returns
cumulative_returns = (1 + portfolio_returns).cumprod()
cumulative_returns.head()
```

```
date
2015-06-24         NaN
2015-06-25    1.004358
2015-06-26    1.010575
2015-06-29    0.994650
2015-06-30    0.992269
dtype: float64
```
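Each entry of `(1 + r).cumprod()` is the growth factor of $1 invested at the start: multiply the successive `(1 + daily_return)` terms. On a hypothetical three-day series:

```python
import pandas as pd

# Hypothetical daily portfolio returns
r = pd.Series([0.01, -0.02, 0.03])

# Growth of $1: 1.01, then 1.01 * 0.98, then 1.01 * 0.98 * 1.03
cumulative = (1 + r).cumprod()

print(cumulative.round(6).tolist())  # [1.01, 0.9898, 1.019494]
```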
### Plot Return of Portfolio Starting with Initial Investment of $10,000
```python
# Plot the returns of the portfolio in terms of money
initial_investment = 10000
cumulative_profit = initial_investment * cumulative_returns
cumulative_profit.plot()
```
## PART 2: Portfolio Optimization via Correlation and Return-to-Risk (Sharpe Ratio) Evaluations
### Reset Daily Returns DataFrame
```python
# Re-calculate daily returns, as the DataFrame was modified in Part 1
daily_returns = combined_df.pct_change()
daily_returns.head()
```

| date | BK | FANG | JNJ | LUV | MU | NKE | SBUX | T | WDC | WRK |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015-06-24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2015-06-25 | -0.010164 | -0.020139 | -0.002114 | 0.004076 | -0.001663 | -0.009414 | 0.006703 | 0.011179 | -0.020290 | -0.023352 |
| 2015-06-26 | 0.003034 | -0.014923 | 0.005246 | -0.004929 | -0.181515 | 0.042672 | 0.010172 | -0.001658 | -0.017452 | 0.000000 |
| 2015-06-29 | -0.027687 | -0.001196 | -0.019671 | -0.033217 | -0.047304 | -0.018594 | -0.019590 | -0.009690 | -0.029485 | -0.017770 |
| 2015-06-30 | 0.004307 | 0.002927 | -0.002252 | -0.002712 | 0.005873 | 0.003251 | 0.001214 | -0.006989 | -0.043192 | 0.001645 |
### Evaluate Correlation of Stock Returns
```python
# Use the `corr` function to calculate the correlation between stock returns
correlation = daily_returns.corr()
correlation
```

| | BK | FANG | JNJ | LUV | MU | NKE | SBUX | T | WDC | WRK |
|---|---|---|---|---|---|---|---|---|---|---|
| BK | 1.000000 | 0.316856 | 0.350064 | 0.390731 | 0.339511 | 0.352312 | 0.361186 | 0.284662 | 0.373527 | 0.473280 |
| FANG | 0.316856 | 1.000000 | 0.221352 | 0.169883 | 0.257745 | 0.228170 | 0.211182 | 0.186524 | 0.259662 | 0.294395 |
| JNJ | 0.350064 | 0.221352 | 1.000000 | 0.248263 | 0.173480 | 0.297197 | 0.366453 | 0.343575 | 0.223365 | 0.295180 |
| LUV | 0.390731 | 0.169883 | 0.248263 | 1.000000 | 0.296751 | 0.285755 | 0.314467 | 0.220850 | 0.273768 | 0.294395 |
| MU | 0.339511 | 0.257745 | 0.173480 | 0.296751 | 1.000000 | 0.196334 | 0.247012 | 0.174143 | 0.609577 | 0.405658 |
| NKE | 0.352312 | 0.228170 | 0.297197 | 0.285755 | 0.196334 | 1.000000 | 0.398614 | 0.240217 | 0.242891 | 0.271221 |
| SBUX | 0.361186 | 0.211182 | 0.366453 | 0.314467 | 0.247012 | 0.398614 | 1.000000 | 0.238597 | 0.281220 | 0.303981 |
| T | 0.284662 | 0.186524 | 0.343575 | 0.220850 | 0.174143 | 0.240217 | 0.238597 | 1.000000 | 0.153232 | 0.281383 |
| WDC | 0.373527 | 0.259662 | 0.223365 | 0.273768 | 0.609577 | 0.242891 | 0.281220 | 0.153232 | 1.000000 | 0.397666 |
| WRK | 0.473280 | 0.294395 | 0.295180 | 0.294395 | 0.405658 | 0.271221 | 0.303981 | 0.281383 | 0.397666 | 1.000000 |
```python
# Use the `heatmap` function from the Seaborn library to visualize the correlation table
sns.heatmap(correlation, vmin=-1, vmax=1)
```
```python
# Alternatively, sum the total correlation for each stock
correlation.sum(axis=0).sort_values()
```

```
T       3.123182
FANG    3.145768
LUV     3.494862
NKE     3.512710
JNJ     3.518928
MU      3.700211
SBUX    3.722712
WDC     3.814908
WRK     4.017159
BK      4.242129
dtype: float64
```
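`corr` computes the pairwise Pearson correlation of the return columns: +1 for columns that move in lockstep, -1 for columns that move exactly opposite. A sketch with hypothetical, perfectly (anti-)correlated columns:

```python
import pandas as pd

# Hypothetical returns: "b" is exactly 2*a, "c" is exactly -a
df = pd.DataFrame({
    "a": [0.01, -0.02, 0.03, -0.01],
    "b": [0.02, -0.04, 0.06, -0.02],
    "c": [-0.01, 0.02, -0.03, 0.01],
})

corr = df.corr()  # pairwise Pearson correlation matrix
print(round(corr.loc["a", "b"], 6), round(corr.loc["a", "c"], 6))  # 1.0 -1.0
```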
### Drop the Two Most Consistently Correlated Stocks
```python
# Drop the highly correlated stocks and keep only the less-correlated stocks
noncorrelated_daily_returns = daily_returns.drop(columns=['WRK', 'BK'])
noncorrelated_daily_returns.head()
```

| date | FANG | JNJ | LUV | MU | NKE | SBUX | T | WDC |
|---|---|---|---|---|---|---|---|---|
| date | ||||||||
| 2015-06-24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2015-06-25 | -0.020139 | -0.002114 | 0.004076 | -0.001663 | -0.009414 | 0.006703 | 0.011179 | -0.020290 |
| 2015-06-26 | -0.014923 | 0.005246 | -0.004929 | -0.181515 | 0.042672 | 0.010172 | -0.001658 | -0.017452 |
| 2015-06-29 | -0.001196 | -0.019671 | -0.033217 | -0.047304 | -0.018594 | -0.019590 | -0.009690 | -0.029485 |
| 2015-06-30 | 0.002927 | -0.002252 | -0.002712 | 0.005873 | 0.003251 | 0.001214 | -0.006989 | -0.043192 |
Note: selecting which two stocks to drop is somewhat subjective. Here, WRK and BK were dropped because, in the heatmap form of the correlation matrix, they showed the consistently highest temperatures (lightest colors); in other words, they were the most consistently correlated with all of the other stocks.

A more systematic approach is to sum each stock's total correlation and compare. The code for that approach, shown just above, reaches the same conclusion.
### Evaluate Reward-to-Risk (Sharpe Ratio) of Minimally-Correlated Stocks
```python
# Use the `mean` and `std` functions to calculate the annualized Sharpe ratio
sharpe_ratios = (noncorrelated_daily_returns.mean() * 252) / (
    noncorrelated_daily_returns.std() * np.sqrt(252)
)
sharpe_ratios
```

```
FANG    0.421266
JNJ     0.609743
LUV     0.520850
MU      0.435027
NKE     0.585923
SBUX    0.545402
T      -0.050216
WDC    -0.245731
dtype: float64
```
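The cell above annualizes both the numerator (mean daily return times 252) and the denominator (daily std times the square root of 252), with the risk-free rate implicitly taken as zero. Since 252 / sqrt(252) = sqrt(252), the same ratio can be written as the daily mean-over-std scaled by sqrt(252):

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for one stock
r = pd.Series([0.01, -0.005, 0.007, 0.002, -0.001])

# Annualized Sharpe ratio with a zero risk-free rate, as in the cell above
sharpe = (r.mean() * 252) / (r.std() * np.sqrt(252))

# Equivalent form: daily Sharpe scaled by sqrt(252)
assert np.isclose(sharpe, (r.mean() / r.std()) * np.sqrt(252))
```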
### Drop the Three Lowest Sharpe Ratio Stocks
```python
# Drop the three lowest Sharpe ratio stocks and keep only the higher
# Sharpe ratio stocks
candidate_daily_returns = noncorrelated_daily_returns.drop(columns=['T', 'WDC', 'FANG'])
candidate_daily_returns.head()
```

| date | JNJ | LUV | MU | NKE | SBUX |
|---|---|---|---|---|---|
| date | |||||
| 2015-06-24 | NaN | NaN | NaN | NaN | NaN |
| 2015-06-25 | -0.002114 | 0.004076 | -0.001663 | -0.009414 | 0.006703 |
| 2015-06-26 | 0.005246 | -0.004929 | -0.181515 | 0.042672 | 0.010172 |
| 2015-06-29 | -0.019671 | -0.033217 | -0.047304 | -0.018594 | -0.019590 |
| 2015-06-30 | -0.002252 | -0.002712 | 0.005873 | 0.003251 | 0.001214 |
### Evaluate Investment in Minimally-Correlated (Diversified) and Return-to-Risk (Sharpe Ratio) Optimized Portfolio
```python
# Set portfolio weights, calculate daily and cumulative portfolio returns, and
# plot a $10,000 investment over time for the optimized portfolio
initial_investment = 10000
weights = [.2, .2, .2, .2, .2]
candidate_portfolio_returns = candidate_daily_returns.dot(weights)
candidate_cumulative_returns = (1 + candidate_portfolio_returns).cumprod()
candidate_cumulative_profits = initial_investment * candidate_cumulative_returns
candidate_cumulative_profits.plot(figsize=(20, 10))
```
### Evaluate Investment in Non-Correlated (Diversified) Portfolio
```python
# Set portfolio weights, calculate daily and cumulative portfolio returns, and
# plot a $10,000 investment over time for the uncorrelated portfolio
initial_investment = 10000
weights = [0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125]
noncorrelated_portfolio_returns = noncorrelated_daily_returns.dot(weights)
noncorrelated_cumulative_returns = (1 + noncorrelated_portfolio_returns).cumprod()
noncorrelated_cumulative_profits = initial_investment * noncorrelated_cumulative_returns
noncorrelated_cumulative_profits.plot(figsize=(20, 10))
```
### Evaluate Investment in Original/Unoptimized Portfolio (No Modifications)
```python
# Set portfolio weights, calculate daily and cumulative portfolio returns, and
# plot a $10,000 investment over time for the unoptimized portfolio
initial_investment = 10000
weights = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
unoptimized_portfolio_returns = daily_returns.dot(weights)
unoptimized_cumulative_returns = (1 + unoptimized_portfolio_returns).cumprod()
unoptimized_cumulative_profits = initial_investment * unoptimized_cumulative_returns
unoptimized_cumulative_profits.plot(figsize=(20, 10))
```
### Overlay Hypothetical Portfolio Investments
```python
# Set the figure of the chart and plot each data trend on the single figure
ax = candidate_cumulative_profits.plot(figsize=(20, 10))
noncorrelated_cumulative_profits.plot(ax=ax)
unoptimized_cumulative_profits.plot(ax=ax)
cumulative_profit.plot(ax=ax)
ax.legend([
    'PART 2: Diversified and Return-to-Risk Optimized Portfolio $10,000',
    'PART 2: Diversified Portfolio $10,000',
    'PART 2: Original/Unoptimized Portfolio $10,000',
    'PART 1: Risk-Optimized Portfolio $10,000',
])
```
## Student Activity: Portfolio Planner Part I
This program is part 1 of a two-part student activity and covers the following:

PART 1: Portfolio Optimization via Risk Evaluation

Reads in the CSV datasets of 10 stocks, calculates the volatility of each stock, drops the five most volatile stocks, sets allocations for the remaining stocks based on risk/volatility, and calculates the returns of a hypothetical $10,000 investment in the constructed portfolio.

PART 2: Portfolio Optimization via Correlation and Return-to-Risk (Sharpe Ratio) Evaluations

To be continued in the second part!
```python
# Import libraries and dependencies
import numpy as np
import pandas as pd
from pathlib import Path

%matplotlib inline
```

### Read CSVs as DataFrames
```python
# Set the file paths
bk_data = Path("../Resources/bk_data.csv")
fang_data = Path("../Resources/fang_data.csv")
jnj_data = Path("../Resources/jnj_data.csv")
luv_data = Path("../Resources/luv_data.csv")
mu_data = Path("../Resources/mu_data.csv")
nke_data = Path("../Resources/nke_data.csv")
sbux_data = Path("../Resources/sbux_data.csv")
t_data = Path("../Resources/t_data.csv")
wdc_data = Path("../Resources/wdc_data.csv")
wrk_data = Path("../Resources/wrk_data.csv")

# Read the CSVs and set the `date` column as a datetime index of each DataFrame
bk_df = pd.read_csv(bk_data, index_col="date", infer_datetime_format=True, parse_dates=True)
fang_df = pd.read_csv(fang_data, index_col="date", infer_datetime_format=True, parse_dates=True)
jnj_df = pd.read_csv(jnj_data, index_col="date", infer_datetime_format=True, parse_dates=True)
luv_df = pd.read_csv(luv_data, index_col="date", infer_datetime_format=True, parse_dates=True)
mu_df = pd.read_csv(mu_data, index_col="date", infer_datetime_format=True, parse_dates=True)
nke_df = pd.read_csv(nke_data, index_col="date", infer_datetime_format=True, parse_dates=True)
sbux_df = pd.read_csv(sbux_data, index_col="date", infer_datetime_format=True, parse_dates=True)
t_df = pd.read_csv(t_data, index_col="date", infer_datetime_format=True, parse_dates=True)
wdc_df = pd.read_csv(wdc_data, index_col="date", infer_datetime_format=True, parse_dates=True)
wrk_df = pd.read_csv(wrk_data, index_col="date", infer_datetime_format=True, parse_dates=True)

# Display a few rows
wrk_df.head()
```

| date | close |
|---|---|
| 2019-05-20 | 35.15 |
| 2019-05-17 | 36.66 |
| 2019-05-16 | 37.42 |
| 2019-05-15 | 37.44 |
| 2019-05-14 | 37.19 |
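The key arguments to `read_csv` here are `index_col="date"` and `parse_dates=True`, which together produce a DatetimeIndex that `sort_index` and `pct_change` can rely on. A minimal sketch using an in-memory stand-in for one ticker CSV (hypothetical values; `infer_datetime_format` is omitted since recent pandas versions deprecate it):

```python
import pandas as pd
from io import StringIO

# In-memory stand-in for one of the ticker CSVs (hypothetical prices)
csv = StringIO("date,close\n2019-05-20,35.15\n2019-05-17,36.66\n")

df = pd.read_csv(csv, index_col="date", parse_dates=True)

assert df.index.dtype.kind == "M"              # datetime64 index, not strings
assert df.loc["2019-05-20", "close"] == 35.15  # label-based lookup by date
```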
### Combine DataFrames, Sort Index, and Rename Columns
```python
# Create a new pivot table where the columns are the closing prices for each ticker
combined_df = pd.concat(
    [bk_df, fang_df, jnj_df, luv_df, mu_df, nke_df, sbux_df, t_df, wdc_df, wrk_df],
    axis="columns",
    join="inner",
)

# Sort datetime index in ascending order (past to present)
combined_df.sort_index(inplace=True)

# Set column names
combined_df.columns = ['BK', 'FANG', 'JNJ', 'LUV', 'MU', 'NKE', 'SBUX', 'T', 'WDC', 'WRK']

# Display a few rows
combined_df.head()
```

| date | BK | FANG | JNJ | LUV | MU | NKE | SBUX | T | WDC | WRK |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015-06-24 | 43.29 | 77.955 | 99.33 | 34.35 | 24.06 | 53.110 | 53.710 | 35.78 | 87.73 | 58.5699 |
| 2015-06-25 | 42.85 | 76.390 | 99.12 | 34.49 | 24.02 | 52.610 | 54.070 | 36.18 | 85.95 | 57.2022 |
| 2015-06-26 | 42.98 | 75.250 | 99.64 | 34.32 | 19.66 | 54.855 | 54.620 | 36.12 | 84.45 | 57.2022 |
| 2015-06-29 | 41.79 | 75.160 | 97.68 | 33.18 | 18.73 | 53.835 | 53.550 | 35.77 | 81.96 | 56.1857 |
| 2015-06-30 | 41.97 | 75.380 | 97.46 | 33.09 | 18.84 | 54.010 | 53.615 | 35.52 | 78.42 | 56.2781 |
### Calculate Daily Returns
```python
# Use the `pct_change` function to calculate daily returns
daily_returns = combined_df.pct_change()
daily_returns.head()
```

| date | BK | FANG | JNJ | LUV | MU | NKE | SBUX | T | WDC | WRK |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015-06-24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2015-06-25 | -0.010164 | -0.020076 | -0.002114 | 0.004076 | -0.001663 | -0.009414 | 0.006703 | 0.011179 | -0.020290 | -0.023352 |
| 2015-06-26 | 0.003034 | -0.014923 | 0.005246 | -0.004929 | -0.181515 | 0.042672 | 0.010172 | -0.001658 | -0.017452 | 0.000000 |
| 2015-06-29 | -0.027687 | -0.001196 | -0.019671 | -0.033217 | -0.047304 | -0.018594 | -0.019590 | -0.009690 | -0.029485 | -0.017770 |
| 2015-06-30 | 0.004307 | 0.002927 | -0.002252 | -0.002712 | 0.005873 | 0.003251 | 0.001214 | -0.006989 | -0.043192 | 0.001645 |
### Evaluate Riskiness of Stocks
```python
# Use the `std` function and multiply by the square root of the number of
# trading days in a year to get annualized volatility
volatility = daily_returns.std() * np.sqrt(252)
volatility.sort_values(inplace=True)
volatility
```

```
JNJ     0.161208
T       0.187571
SBUX    0.206388
BK      0.225078
NKE     0.245942
LUV     0.280153
WRK     0.303048
FANG    0.370971
WDC     0.414714
MU      0.479754
dtype: float64
```
### Drop High Volatility Stocks
```python
# Drop the five stocks with the highest volatility in daily returns
daily_returns.drop(columns=['MU', 'WDC', 'FANG', 'WRK', 'LUV'], inplace=True)
daily_returns.head()
```

| date | BK | JNJ | NKE | SBUX | T |
|---|---|---|---|---|---|
| 2015-06-24 | NaN | NaN | NaN | NaN | NaN |
| 2015-06-25 | -0.010164 | -0.002114 | -0.009414 | 0.006703 | 0.011179 |
| 2015-06-26 | 0.003034 | 0.005246 | 0.042672 | 0.010172 | -0.001658 |
| 2015-06-29 | -0.027687 | -0.019671 | -0.018594 | -0.019590 | -0.009690 |
| 2015-06-30 | 0.004307 | -0.002252 | 0.003251 | 0.001214 | -0.006989 |

### Set Portfolio Allocations/Weights and Calculate Portfolio Daily Returns

```python
# Set weights for the corresponding risk profile of the stocks, then use the
# `dot` function to sum the products of each weight and the corresponding
# stock daily return
# BK, JNJ, NKE, SBUX, T
weights = [0.15, 0.05, 0.10, 0.2, 0.50]
portfolio_returns = daily_returns.dot(weights)
portfolio_returns.head()
```

```
date
2015-06-24         NaN
2015-06-25    0.004358
2015-06-26    0.006190
2015-06-29   -0.015759
2015-06-30   -0.002393
dtype: float64
```

### Calculate Cumulative Returns

```python
# Use the `cumprod` function to calculate cumulative returns
cumulative_returns = (1 + portfolio_returns).cumprod()
cumulative_returns.head()
```

```
date
2015-06-24         NaN
2015-06-25    1.004358
2015-06-26    1.010575
2015-06-29    0.994650
2015-06-30    0.992269
dtype: float64
```
### Plot Return of Portfolio Starting with Initial Investment of $10,000
```python
# Plot the returns of the portfolio in terms of money
initial_investment = 10000
cumulative_profit = initial_investment * cumulative_returns
cumulative_profit.plot()
```